 Valparaíso


A Survey of Explainable Reinforcement Learning: Targets, Methods and Needs

Saulières, Léo

arXiv.org Artificial Intelligence

The success of recent Artificial Intelligence (AI) models has been accompanied by the opacity of their internal mechanisms, due notably to the use of deep neural networks. To understand these internal mechanisms and explain the output of these AI models, a set of methods has been proposed, grouped under the domain of eXplainable AI (XAI). This paper focuses on a sub-domain of XAI, called eXplainable Reinforcement Learning (XRL), which aims to explain the actions of an agent that has learned by reinforcement learning. We propose an intuitive taxonomy based on two questions "What" and "How". The first question focuses on the target that the method explains, while the second relates to the way the explanation is provided. We use this taxonomy to provide a state-of-the-art review of over 250 papers. In addition, we present a set of domains close to XRL, which we believe should get attention from the community. Finally, we identify some needs for the field of XRL.


Convolutional Fourier Analysis Network (CFAN): A Unified Time-Frequency Approach for ECG Classification

Jeong, Sam, Kim, Hae Yong

arXiv.org Artificial Intelligence

Machine learning has transformed the classification of biomedical signals such as electrocardiograms (ECGs). Advances in deep learning, particularly convolutional neural networks (CNNs), enable automatic feature extraction, raising the question: Can combining time- and frequency-domain attributes enhance classification accuracy? To explore this, we evaluated three ECG classification tasks: (1) arrhythmia classification, (2) identity recognition, and (3) apnea detection. We initially tested three methods: (i) 2-D spectrogram-based frequency-time classification (SPECT), (ii) time-domain classification using a 1-D CNN (CNN1D), and (iii) frequency-domain classification using a Fourier transform-based CNN (FFT1D). Performance was validated using K-fold cross-validation. Among these, CNN1D (time only) performed best, followed by SPECT (time-frequency) and FFT1D (frequency only). Surprisingly, SPECT, which integrates time- and frequency-domain features, performed worse than CNN1D, suggesting a need for a more effective time and frequency fusion approach. To address this, we tested the recently proposed Fourier Analysis Network (FAN), which combines time- and frequency-domain features. However, FAN performed comparably to CNN1D, excelling in some tasks while underperforming in others. To enhance this approach, we developed the Convolutional Fourier Analysis Network (CFAN), which integrates FAN with CNN. CFAN outperformed all previous methods across all classification tasks. These findings underscore the advantages of combining time- and frequency-domain features, demonstrating CFAN's potential as a powerful and versatile solution for ECG classification and broader biomedical signal analysis.
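
To make the time-domain versus frequency-domain distinction concrete, the following is a minimal sketch of how a frequency-domain feature vector might be derived from a sampled signal. It is not the authors' FFT1D pipeline: the function name `dft_magnitudes`, the naive O(N^2) transform, and the toy sinusoid are all assumptions chosen for illustration.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Return the magnitudes of DFT bins 0..N//2 of a real-valued signal.

    Naive O(N^2) discrete Fourier transform (illustrative only; a real
    pipeline would use an FFT routine).
    """
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags

# Toy stand-in for a signal window (hypothetical data, not an ECG):
# a sinusoid completing exactly 3 cycles over 32 samples.
toy = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
spectrum = dft_magnitudes(toy)

# Index of the strongest frequency bin.
dominant_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

A time-domain model would consume `toy` directly, while a frequency-only model would consume something like `spectrum`; the abstract's question is how best to fuse the two views.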


Deep Self-Supervised Disturbance Mapping with the OPERA Sentinel-1 Radiometric Terrain Corrected SAR Backscatter Product

Hardiman-Mostow, Harris, Marshak, Charles, Handwerger, Alexander L.

arXiv.org Artificial Intelligence

Mapping land surface disturbances supports disaster response, resource and ecosystem management, and climate adaptation efforts. Synthetic aperture radar (SAR) is an invaluable tool for disturbance mapping, providing consistent time-series images of the ground regardless of weather or illumination conditions. Despite SAR's potential for disturbance mapping, processing SAR data to an analysis-ready format requires expertise and significant compute resources, particularly for large-scale global analysis. In October 2023, NASA's Observational Products for End-Users from Remote Sensing Analysis (OPERA) project released the near-global Radiometric Terrain Corrected SAR backscatter from Sentinel-1 (RTC-S1) dataset, providing publicly available, analysis-ready SAR imagery. In this work, we utilize this new dataset to systematically analyze land surface disturbances. As labeling SAR data is often prohibitively time-consuming, we train a self-supervised vision transformer - which requires no labels to train - on OPERA RTC-S1 data to estimate a per-pixel distribution from the set of baseline imagery and assess disturbances when there is significant deviation from the modeled distribution. To test our model's capability and generality, we evaluate three different natural disasters - which represent high-intensity, abrupt disturbances - from three different regions of the world. Across events, our approach yields high quality delineations: F1 scores exceeding 0.6 and Areas Under the Precision-Recall Curve exceeding 0.65, consistently outperforming existing SAR disturbance methods. Our findings suggest that a self-supervised vision transformer is well-suited for global disturbance mapping and can be a valuable tool for operational, near-global disturbance monitoring, particularly when labeled data does not exist.
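
The idea of flagging a pixel when it deviates significantly from a per-pixel baseline distribution, and then scoring the result with F1, can be sketched as follows. This is a deliberately simplified stand-in: it models each pixel with a Gaussian mean and standard deviation rather than a self-supervised vision transformer, and the names `disturbance_mask`, `f1_score`, and the `z_threshold` parameter are hypothetical.

```python
import statistics

def disturbance_mask(baseline_stack, new_image, z_threshold=3.0, min_std=1e-6):
    """Flag pixels whose new value deviates strongly from their baseline.

    baseline_stack: list of images (each a list of rows of floats).
    new_image: image with the same shape as each baseline image.
    Assumption: a simple per-pixel z-score replaces the learned model.
    """
    mask = []
    for r, row in enumerate(new_image):
        mask_row = []
        for c, value in enumerate(row):
            history = [img[r][c] for img in baseline_stack]
            mu = statistics.fmean(history)
            sigma = max(statistics.pstdev(history), min_std)
            mask_row.append(abs(value - mu) / sigma > z_threshold)
        mask.append(mask_row)
    return mask

def f1_score(predicted, truth):
    """F1 = 2*TP / (2*TP + FP + FN) over two boolean masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy 2x2 example with made-up backscatter values.
baseline = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.1, 2.1], [3.1, 4.1]],
    [[0.9, 1.9], [2.9, 3.9]],
]
after_event = [[1.05, 2.0], [3.0, 9.0]]
mask = disturbance_mask(baseline, after_event)
```

Only the bottom-right pixel departs far enough from its history to be flagged in this toy case.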


Cosmology with Persistent Homology: Parameter Inference via Machine Learning

Calles, Juan, Yip, Jacky H. T., Contardo, Gabriella, Noreña, Jorge, Rouhiainen, Adam, Shiu, Gary

arXiv.org Artificial Intelligence

Building upon [2308.02636], this article investigates the potential constraining power of persistent homology for cosmological parameters and primordial non-Gaussianity amplitudes in a likelihood-free inference pipeline. We evaluate the ability of persistence images (PIs) to infer parameters, compared to the combined Power Spectrum and Bispectrum (PS/BS), and we compare two types of models: neural-based, and tree-based. PIs consistently lead to better predictions compared to the combined PS/BS when the parameters can be constrained (i.e., for $\{\Omega_{\rm m}, \sigma_8, n_{\rm s}, f_{\rm NL}^{\rm loc}\}$). PIs perform particularly well for $f_{\rm NL}^{\rm loc}$, showing the promise of persistent homology in constraining primordial non-Gaussianity. Our results show that combining PIs with PS/BS provides only marginal gains, indicating that the PS/BS contains little extra or complementary information to the PIs. Finally, we provide a visualization of the most important topological features for $f_{\rm NL}^{\rm loc}$ and for $\Omega_{\rm m}$. This reveals that clusters and voids (0-cycles and 2-cycles) are most informative for $\Omega_{\rm m}$, while $f_{\rm NL}^{\rm loc}$ uses the filaments (1-cycles) in addition to the other two types of topological features.
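
For readers unfamiliar with persistent homology, the 0-cycles (connected components, i.e. clusters) are the easiest feature to compute. The sketch below is an illustration only, not the pipeline of the paper: it assumes a distance-based filtration on a small point cloud, in which every point is born as its own component at scale 0 and a component dies at the pairwise distance where it merges into another. The function name `h0_death_scales` is hypothetical.

```python
import math
from itertools import combinations

def h0_death_scales(points):
    """Death scales of 0-dimensional features in a distance filtration.

    Processes edges in order of increasing length with union-find
    (Kruskal-style); each merge of two components records one death.
    One component never dies and is not reported.
    """
    parent = list(range(len(points)))

    def find(i):
        # Find the component representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths

# Two tight pairs of points that are far apart from each other.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_death_scales(cloud)
```

The single long-lived death scale reflects the two well-separated clusters; a persistence image would summarize many such birth-death pairs as a fixed-size array suitable for a neural or tree-based model.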


Machine learning-based probabilistic forecasting of solar irradiance in Chile

Baran, Sándor, Marín, Julio C., Cuevas, Omar, Díaz, Mailiu, Szabó, Marianna, Nicolis, Orietta, Lakatos, Mária

arXiv.org Machine Learning

By the end of 2023, renewable sources covered 63.4% of the total electric power demand of Chile, and in line with the global trend, photovoltaic (PV) power shows the most dynamic increase. Although Chile's Atacama Desert is considered the sunniest place on Earth, PV power production, even in this area, can be highly volatile. Successful integration of PV energy into the country's power grid requires accurate short-term PV power forecasts, which can be obtained from predictions of solar irradiance and related weather quantities. Nowadays, in weather forecasting, the state-of-the-art approach is the use of ensemble forecasts based on multiple runs of numerical weather prediction models. However, ensemble forecasts still tend to be uncalibrated or biased, thus requiring some form of post-processing. The present work investigates probabilistic forecasts of solar irradiance for Regions III and IV in Chile. To this end, 8-member short-term ensemble forecasts of solar irradiance for calendar year 2021 are generated using the Weather Research and Forecasting (WRF) model, which are then calibrated using the benchmark ensemble model output statistics (EMOS) method based on a censored Gaussian law, and its machine learning-based distributional regression network (DRN) counterpart. Furthermore, we also propose a neural network-based post-processing method resulting in improved 8-member ensemble predictions. All forecasts are evaluated against station observations for 30 locations, and the skill of post-processed predictions is compared to the raw WRF ensemble. Our case study confirms that all studied post-processing methods substantially improve both the calibration of probabilistic forecasts and the accuracy of point forecasts. Among the methods tested, the corrected ensemble exhibits the best overall performance. Additionally, the DRN model generally outperforms the corresponding EMOS approach.
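
Comparing a raw ensemble with a post-processed one requires a score for probabilistic forecasts. The abstract does not name its verification scores, so the choice here is an assumption: the continuous ranked probability score (CRPS) of an ensemble, a widely used measure in this setting, computed as the mean absolute error of the members minus half the mean absolute difference between members. The function name `ensemble_crps` and the numbers are illustrative.

```python
def ensemble_crps(members, observation):
    """CRPS of the empirical distribution of an ensemble (lower is better).

    CRPS = (1/m) * sum_i |x_i - y|  -  (1/(2 m^2)) * sum_i sum_j |x_i - x_j|
    """
    m = len(members)
    error_term = sum(abs(x - observation) for x in members) / m
    spread_term = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return error_term - spread_term

# Made-up irradiance-like values: a biased ensemble versus a shifted one.
raw = [420.0, 430.0, 440.0, 450.0]
corrected = [x - 35.0 for x in raw]   # hypothetical bias correction
observed = 400.0

raw_score = ensemble_crps(raw, observed)
corrected_score = ensemble_crps(corrected, observed)
```

For a single-member ensemble the score reduces to the absolute error, which is why CRPS is often read as a probabilistic generalization of it.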


Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning

Champion, Théophile, Grześ, Marek, Bowman, Howard

arXiv.org Machine Learning

Model-based reinforcement learning refers to a set of approaches capable of sample-efficient decision making, which create an explicit model of the environment. This model can subsequently be used for learning optimal policies. In this paper, we propose a temporal Gaussian Mixture Model composed of a perception model and a transition model. The perception model extracts discrete (latent) states from continuous observations using a variational Gaussian mixture likelihood. Importantly, our model constantly monitors the collected data searching for new Gaussian components, i.e., the perception model performs a form of structure learning (Smith et al., 2020; Friston et al., 2018; Neacsu et al., 2022) as it learns the number of Gaussian components in the mixture. Additionally, the transition model learns the temporal transition between consecutive time steps by taking advantage of the Dirichlet-categorical conjugacy. Both the perception and transition models are able to forget part of the data points, while integrating the information they provide within the prior, which ensures fast variational inference. Finally, decision making is performed with a variant of Q-learning which is able to learn Q-values from beliefs over states. Empirically, we have demonstrated the model's ability to learn the structure of several mazes: the model discovered the number of states and the transition probabilities between these states. Moreover, using its learned Q-values, the agent was able to successfully navigate from the starting position to the maze's exit.
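
The Dirichlet-categorical conjugacy mentioned above means that a Dirichlet prior over each row of a transition matrix is updated simply by adding observed transition counts. The sketch below illustrates only that update, assuming fully observed discrete states and ignoring actions, beliefs over states, and forgetting; the function names are hypothetical and this is not the authors' transition model.

```python
def update_transition_counts(alpha, transitions):
    """Posterior Dirichlet parameters after observing (state, next_state) pairs.

    alpha: 2-D list where alpha[s][s2] is the prior concentration for s -> s2.
    Conjugacy: posterior concentration = prior concentration + count.
    """
    posterior = [row[:] for row in alpha]  # copy so the prior is untouched
    for state, next_state in transitions:
        posterior[state][next_state] += 1.0
    return posterior

def expected_transition_probs(posterior):
    """Posterior mean of each categorical row: alpha[s][s2] / sum(alpha[s])."""
    return [[a / sum(row) for a in row] for row in posterior]

# Uniform prior over a 2-state chain and three observed transitions.
prior = [[1.0, 1.0], [1.0, 1.0]]
observed = [(0, 1), (0, 1), (1, 0)]
posterior = update_transition_counts(prior, observed)
probs = expected_transition_probs(posterior)
```

Because the prior carries the same form as the posterior, old data can in principle be folded into `prior` and discarded, which is one way to read the abstract's remark about forgetting data points while keeping their information.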


Compact Optimality Verification for Optimization Proxies

Chen, Wenbo, Zhao, Haoruo, Tanneau, Mathieu, Van Hentenryck, Pascal

arXiv.org Artificial Intelligence

Recent years have witnessed increasing interest in optimization proxies, i.e., machine learning models that approximate the input-output mapping of parametric optimization problems and return near-optimal feasible solutions. Following recent work by Nellikkath & Chatzivasileiadis (2021), this paper reconsiders the optimality verification problem for optimization proxies, i.e., the determination of the worst-case optimality gap over the instance distribution. The paper proposes a compact formulation for optimality verification and a gradient-based primal heuristic that brings substantial computational benefits to the original formulation. The compact formulation is also more general and applies to non-convex optimization problems. The benefits of the compact formulation are demonstrated on large-scale DC Optimal Power Flow and knapsack problems.
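
To illustrate what a worst-case optimality gap means, the sketch below enumerates a small family of knapsack instances (parameterized by capacity), compares a proxy's solution value with the true optimum on each, and reports the largest gap. Everything here is an assumption made for illustration: a greedy value-to-weight heuristic stands in for a learned proxy, and brute-force enumeration stands in for the paper's compact formulation, which is not reproduced.

```python
def knapsack_optimal(values, weights, capacity):
    """Exact 0/1 knapsack value via dynamic programming over capacities."""
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

def knapsack_greedy(values, weights, capacity):
    """Stand-in 'proxy': take items by decreasing value/weight while they fit."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, remaining = 0, capacity
    for i in order:
        if weights[i] <= remaining:
            total += values[i]
            remaining -= weights[i]
    return total

def worst_case_gap(values, weights, capacities):
    """Largest (optimal - proxy) value over the given instance family."""
    return max(
        knapsack_optimal(values, weights, c) - knapsack_greedy(values, weights, c)
        for c in capacities
    )

values, weights = [60, 100, 120], [10, 20, 30]
gap = worst_case_gap(values, weights, range(0, 51))
```

Enumeration only works for tiny, discrete instance families; the point of a verification formulation is to bound this quantity without visiting every instance.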


Automatic Navigation Map Generation for Mobile Robots in Urban Environments

Mozzarelli, Luca, Specchia, Simone, Corno, Matteo, Savaresi, Sergio Matteo

arXiv.org Artificial Intelligence

A fundamental prerequisite for safe and efficient navigation of mobile robots is the availability of reliable navigation maps upon which trajectories can be planned. With the increasing industrial interest in mobile robotics, especially in urban environments, the process of generating navigation maps has become of particular interest, being a labor-intensive step of the deployment process. Automating this step is challenging and becomes even more arduous when the perception capabilities are limited by cost considerations. This paper proposes an algorithm to automatically generate navigation maps using a typical navigation-oriented sensor setup: a single top-mounted 3D LiDAR sensor. The proposed method is designed and validated with the urban environment as the main use case: it is shown to be able to produce accurate maps featuring different terrain types, positive obstacles of different heights as well as negative obstacles. The algorithm is applied to data collected in a typical urban environment with a wheeled inverted pendulum robot, showing its robustness against localization, perception and dynamic uncertainties. The generated map is validated against a human-made map.


Safe reinforcement learning in uncertain contexts

Baumann, Dominik, Schön, Thomas B.

arXiv.org Artificial Intelligence

When deploying machine learning algorithms in the real world, guaranteeing safety is essential. Existing safe learning approaches typically consider continuous variables, i.e., regression tasks. However, in practice, robotic systems are also subject to discrete, external environmental changes, e.g., having to carry objects of certain weights or operating on frozen, wet, or dry surfaces. Such influences can be modeled as discrete context variables. In the existing literature, such contexts are, if considered, mostly assumed to be known. In this work, we drop this assumption and show how we can perform safe learning when we cannot directly measure the context variables. To achieve this, we derive frequentist guarantees for multi-class classification, allowing us to estimate the current context from measurements. Further, we propose an approach for identifying contexts through experiments. We discuss under which conditions we can retain theoretical guarantees and demonstrate the applicability of our algorithm on a Furuta pendulum with camera measurements of different weights that serve as contexts.


A Self-Commissioning Edge Computing Method for Data-Driven Anomaly Detection in Power Electronic Systems

Gomez, Pere Izquierdo, Gajardo, Miguel E. Lopez, Mijatovic, Nenad, Dragicevic, Tomislav

arXiv.org Artificial Intelligence

Ensuring the reliability of power electronic converters is a matter of great importance, and data-driven condition monitoring techniques are cementing themselves as an important tool for this purpose. However, translating methods that work well in controlled lab environments to field applications presents significant challenges, notably because of the limited diversity and accuracy of the lab training data. By enabling the use of field data, online machine learning can be a powerful tool to overcome this problem, but it introduces additional challenges in ensuring the stability and predictability of the training processes. This work presents an edge computing method that mitigates these shortcomings with minimal additional memory usage, by employing an autonomous algorithm that prioritizes the storage of training samples with larger prediction errors. The method is demonstrated on the use case of a self-commissioning condition monitoring system, in the form of a thermal anomaly detection scheme for a variable frequency motor drive, where the algorithm self-learned to distinguish normal and anomalous operation with minimal prior knowledge. The obtained results, based on experimental data, show a significant improvement in prediction accuracy and training speed, when compared to equivalent models trained online without the proposed data selection process.
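
One plausible way to prioritize the storage of training samples with larger prediction errors under a fixed memory budget is a bounded min-heap keyed on error: a new sample is kept only if the buffer has room or if its error exceeds the smallest error currently stored. This is a minimal sketch under that assumption, not the authors' algorithm; the class name `ErrorPrioritizedBuffer` and its methods are hypothetical.

```python
import heapq
import itertools

class ErrorPrioritizedBuffer:
    """Fixed-capacity store that retains the samples with the largest errors."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._heap = []                    # min-heap of (error, tiebreak, sample)
        self._counter = itertools.count()  # tiebreak so samples are never compared

    def offer(self, sample, prediction_error):
        """Try to store a sample; return True if it was kept."""
        entry = (prediction_error, next(self._counter), sample)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if prediction_error > self._heap[0][0]:
            # Evict the stored sample with the smallest error.
            heapq.heapreplace(self._heap, entry)
            return True
        return False

    def samples(self):
        """Stored samples, in no particular order."""
        return [sample for _, _, sample in self._heap]

# Usage with made-up sample labels and errors.
buffer = ErrorPrioritizedBuffer(capacity=2)
kept = [
    buffer.offer("a", 0.1),
    buffer.offer("b", 0.5),
    buffer.offer("c", 0.3),   # replaces "a", the lowest-error sample
    buffer.offer("d", 0.05),  # rejected: error below everything stored
]
```

Memory stays bounded by `capacity`, and each `offer` costs O(log capacity), which suits an edge device retraining online on the retained samples.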